Neuromorphic Model for Sound Source Segregation
Authors
Abstract
Title of dissertation: NEUROMORPHIC MODEL FOR SOUND SOURCE SEGREGATION. Lakshmi Krishnan, Doctor of Philosophy, 2015. Dissertation directed by: Professor Shihab Shamma, Department of Electrical and Computer Engineering.

While humans can easily segregate and track a speaker's voice in a loud, noisy environment, most modern speech recognition systems still perform poorly in loud background noise. The computational principles behind auditory source segregation in humans are not yet fully understood. In this dissertation, we develop a computational model for source segregation inspired by auditory processing in the brain. To support the key principles behind the computational model, we conduct a series of electroencephalography (EEG) experiments using both simple tone-based stimuli and more natural speech stimuli. Most source segregation algorithms rely on some form of prior information about the target speaker or use more than one simultaneous recording of the noisy speech mixtures. Other methods build models of the noise characteristics. Source segregation of simultaneous speech mixtures with a single microphone recording and no knowledge of the target speaker is still a challenge. Using the principle of temporal coherence, we develop a novel computational model that exploits the difference in the temporal evolution of features that belong to different sources to perform unsupervised monaural source segregation. Although the method requires no prior information about the target speaker, it can gracefully incorporate such knowledge to further enhance the segregation. Through a series of EEG experiments, we collect neurological evidence to support the principle behind the model.
Aside from its unusual structure and computational innovations, the proposed model provides testable hypotheses of the physiological mechanisms of the remarkable perceptual ability of humans to segregate acoustic sources, and of its psychophysical manifestations in navigating complex sensory environments. Results from the EEG experiments provide further insight into the assumptions behind the model and motivate future single-unit studies that can provide more direct evidence for the principle of temporal coherence.
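The temporal coherence principle described in the abstract can be illustrated with a toy example. The sketch below is not the dissertation's model: it assumes that each frequency channel is already reduced to a slow envelope, that two synthetic sources modulate their channels at different rates, and that one "anchor" channel is known to belong to the target. The function name `coherent_channels`, the correlation threshold, and all signal parameters are hypothetical choices made for illustration only.

```python
import numpy as np

def coherent_channels(envelopes, anchor, threshold=0.5):
    """Return a boolean mask of channels whose envelope co-varies with the anchor.

    envelopes: array of shape (channels, frames), one slow envelope per channel.
    anchor:    index of a channel assumed to belong to the target source.
    threshold: hypothetical correlation cutoff for "temporally coherent".
    """
    centered = envelopes - envelopes.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for flat channels
    unit = centered / norms[:, None]
    correlation = unit @ unit[anchor]  # Pearson correlation with the anchor
    return correlation > threshold

# Toy mixture (assumed, not from the dissertation): source A modulates
# channels 0-2 at 4 Hz, source B modulates channels 3-5 at 7 Hz.
rng = np.random.default_rng(0)
t = np.arange(200) / 100.0  # 2 s at 100 frames per second
env_a = 1.0 + np.sin(2 * np.pi * 4 * t)
env_b = 1.0 + np.sin(2 * np.pi * 7 * t)
gains_a = np.array([1.0, 0.6, 0.8])
gains_b = np.array([1.0, 0.5, 0.9])
envelopes = np.vstack([gains_a[:, None] * env_a, gains_b[:, None] * env_b])
envelopes += 0.05 * rng.standard_normal(envelopes.shape)  # small sensor noise

mask_a = coherent_channels(envelopes, anchor=0)  # channels grouped with source A
mask_b = coherent_channels(envelopes, anchor=3)  # channels grouped with source B
print(mask_a, mask_b)
```

Channels driven by the same source rise and fall together, so they correlate strongly with the anchor, while channels from the other source do not. No speaker-specific prior is used beyond the choice of anchor, which loosely mirrors the unsupervised setting the abstract describes.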
Similar resources
Sound stream segregation: a neuromorphic approach to solve the “cocktail party problem” in real-time
The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the "cocktail party effect." It has not been possible to build a machine that can emulate this human ability in real-t...
A Neuromorphic Monaural Sound Localizer
We describe the first single microphone sound localization system and its inspiration from theories of human monaural sound localization. Reflections and diffractions caused by the external ear (pinna) allow humans to estimate sound source elevations using only one ear. Our single microphone localization model relies on a specially shaped reflecting structure that serves the role of the pinna. ...
A NEUROMORPHIC MICROPHONE FOR SOUND LOCALIZATION By CHIANG-JUNG PU A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA
Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy A NEUROMORPHIC MICROPHONE FOR SOUND LOCALIZATION By Chiang-Jung Pu May 1998 Chairman: Dr. John Harris Major Department: Electrical and Computer Engineering Despite many decades of research in human sound localization, exactly how humans c...
The story of AudioSapiana
AudioSapiana is a listening and walking robot designed by the audio group at the 2005 Neuromorphic Workshop. We added ears to a RoboSapien robot from Wow-Wee Entertainment to help it navigate in a difficult, obstacle-strewn environment. Previously, robots that oriented themselves towards a target beacon were designed to work in quiet environments. AudioSapiana, however, was designed so she coul...
Spatial Hearing Algorithms Based on Binaural Zero-Crossings: Sound Source Localization, Segregation, and Dereverberation
This thesis concerns a new zero-crossing-based binaural model for spatial hearing. The conventional binaural model computes cross-correlations of binaural signals to estimate the interaural time difference, which is a primary spatial cue. However, the cross-correlation-based binaural processing model requires high computational complexity and suffers from inaccuracies in localizing sound so...